Synthetic Data Generation for Custom Policies
When you create a custom policy, DynamoGuard uses Synthetic Data Generation to produce a robust set of training data. This training data covers a variety of datapoint types so that it simulates meaningful, realistic LLM usage and response scenarios.
Input Data Taxonomy
- In-Domain: Prompts that are relevant to the usage scenario and simulate common user inputs
- Borderline: Prompts that are difficult to classify as compliant or non-compliant
- Diverse: Prompts, possibly out-of-domain, that simulate unexpected user inputs
- Jailbreak: Adversarial or tricky prompts that leverage Dynamo's 60+ jailbreaking techniques to challenge the policy model
Output Data Taxonomy
- Aligned: Model responses that are aligned to be compliant with the policy
- Jailbreak: Model responses that are deliberately non-compliant with the policy
- Neutral: General model responses that are not specifically aligned to be compliant or non-compliant with the policy
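
As an illustration only, the two taxonomies above could be modeled as enumerations, with a small helper that reports which categories a dataset does not yet cover. This is a minimal sketch and not part of the DynamoGuard API: the names `InputCategory`, `OutputCategory`, `Datapoint`, and `missing_categories` are hypothetical, chosen here for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Set, Union


# Hypothetical names: these enums mirror the taxonomies in this page,
# they are not DynamoGuard identifiers.
class InputCategory(Enum):
    IN_DOMAIN = "in-domain"
    BORDERLINE = "borderline"
    DIVERSE = "diverse"
    JAILBREAK = "jailbreak"


class OutputCategory(Enum):
    ALIGNED = "aligned"
    JAILBREAK = "jailbreak"
    NEUTRAL = "neutral"


Category = Union[InputCategory, OutputCategory]


@dataclass(frozen=True)
class Datapoint:
    """One synthetic example tagged with its taxonomy category."""
    text: str
    category: Category


def missing_categories(datapoints: Iterable[Datapoint]) -> Set[Category]:
    """Return every input or output category with no example in the dataset."""
    present = {dp.category for dp in datapoints}
    expected: Set[Category] = set(InputCategory) | set(OutputCategory)
    return expected - present
```

Keeping input and output categories as separate enums means the input-side Jailbreak (an adversarial prompt) and the output-side Jailbreak (a non-compliant response) stay distinct even though they share a name.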